- E D I C T
- =========
-
- Public Domain Japanese/English Dictionary file, coordinated by Jim Breen.
-
- CURRENT VERSION
- ---------------
-
- The version date and sequence number are included in the dictionary itself under
- the entry "EDICT". (Actually it is under the JIS-ASCII code "????". This keeps
- it as the first entry when it is sorted.)
-
- The master copy of EDICT is in the pub/Nihongo directory of
- monu6.cc.monash.edu.au. There are other copies around, but they may not be as
- up to date. The easiest way to check whether your copy is the latest is to
- compare its size and date with those of the master copy.
-
- INTRODUCTION
- ------------
-
- EDICT is an attempt to produce a public domain Japanese/English Dictionary in
- machine-readable form. It was intended initially for use with MOKE (Mark's Own
- Kanji Editor) and related software such as JDIC and JREADER, but it has the
- potential to be used in a large number of packages.
-
- At present it is in the "public domain"; however, at some stage it may be placed
- under GNU or Copyleft protection, mainly to prevent the work of its many
- contributors being exploited by commercial software developers.
-
- FORMAT
- ------
-
- EDICT is in the "EDICT" format used by MOKE. It uses EUC coding for kana and
- kanji, but this can be converted to JIS or SJIS by any of the several
- conversion programs around. It is a text file with one entry per line. The
- format of entries is:
-
- KANJI [KANA] /english_1/english_2/.../
-
- or
-
- KANA /english_1/.../
-
- The English translations are deliberately brief, as the dictionary is expected
- to be used primarily for on-line look-ups, etc.
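-
- The entry format above can be read with a few lines of code. What follows is
- only a minimal sketch in Python, not the parser used by MOKE or JDIC. The names
- Entry and parse_edict_line are made up for this illustration, and the sketch
- assumes the line has already been decoded from EUC into text.
-
```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    headword: str            # kanji form, or kana when there is no kanji
    reading: Optional[str]   # kana reading from the [brackets], if present
    glosses: list            # English translations, in file order


# Headword, optional [reading], then a slash-delimited list of translations.
_LINE = re.compile(r"^(\S+)\s+(?:\[([^\]]*)\]\s+)?/(.*)/\s*$")


def parse_edict_line(line: str) -> Optional[Entry]:
    """Parse one EDICT line; return None if it does not fit the format."""
    m = _LINE.match(line)
    if m is None:
        return None
    headword, reading, body = m.groups()
    # Splitting on "/" yields the individual translations; drop empty pieces.
    glosses = [g for g in body.split("/") if g]
    return Entry(headword, reading, glosses)
```
-
- When reading the file itself, Python's standard "euc_jp" codec can do the
- decoding, e.g. open(path, encoding="euc_jp"), where path is wherever your copy
- of EDICT lives. The standard "iso2022_jp" and "shift_jis" codecs cover JIS and
- SJIS output.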
-
- CONTENTS
- --------
-
- EDICT consists of:
-
- (a) the basic EDICT distributed with MOKE 2.0. This was compiled by MOKE's
- author, Mark Edwards, with assistance from Spencer Green. Mark has very kindly
- released this material to the public domain as part of EDICTJ. A number of
- corrections have been made to the MOKE original, e.g. spelling mistakes, minor
- mistranslations, etc. The original also had a lot of duplications, which have
- been removed; it contained about 1900 unique entries. Mark Edwards has also kindly
- given permission for the vocabulary files developed for KG (Kanji Guess) to be
- added to EDICT.
-
- (b) additions by Jim Breen. I laboriously keyed in a ~2000-entry dictionary
- used in my first-year nihongo course at Swinburne Institute of Technology years
- ago (I was given permission by the authors to do this). I then worked through
- other vocabulary lists trying to make sure major entries were not omitted. This
- task is continuing, although it has slowed down, and I suspect I will run out
- of energy eventually. Apart from that, I have made a large number of additions
- during normal MOKE and JREADER usage (e.g. using it to read fj.* news).
-
- (c) additions by others. Many people have contributed entries and corrections
- to EDICT. I am forever on the lookout for sources of material, provided it is
- genuinely available for use in the Public Domain. I am especially grateful to
- Theresa Martin who has been supplying a lot of useful material, plus very
- perceptive corrections. Hidekazu Tozaki has also been a great help with
- tidying up a lot of awry entries, and helping me identify obscure kanji
- compounds. A full list is at the back of this file. A massive group of
- contributions came from Sony, where Rik Smoody had put together a large online
- dictionary.
-
- At this stage EDICT is nowhere near as big as a good commercial dictionary, which
- typically has 20,000+ non-name entries with examples, etc. It is, however,
- bigger than some of the smaller printed dictionaries, and when used in
- conjunction with a search-and-display program like JDIC it provides an
- effective on-line dictionary service.
-
- COPYRIGHT?
- ----------
-
- A word on copyright. Of course most of the material in EDICT came from other
- published lists. Dictionary copyright is a difficult point, because clearly
- the first lexicographer who published "inu means dog" could not claim a
- copyright violation over all subsequent Japanese dictionaries. What makes each
- dictionary unique (and copyrightable) is the particular selection of words, the
- phrasing of the meanings, the presentation of the contents (a very important
- point in the case of EDICT), and the means of publication. The advice I have
- received from people who know about these things is that EDICT is just as much
- a new dictionary as any others on the market. Readers may see an entry which
- looks familiar, and say "Aha! That comes from the XYZ Jiten!". They may be
- right, and they may be wrong. After all there aren't too many translations of
- neko. Let me make one thing quite clear. NONE of this dictionary came from
- commercial machine-readable dictionaries. I have a case of RSI in my right
- elbow to prove it.
-
- Please do not contribute entries to EDICT which have come directly from
- copyrightable sources. It is hard to check these, and you may be jeopardizing
- EDICT's PD status.
-
- LEXICOGRAPHICAL DETAILS
- -----------------------
-
- EDICT is actually a Japanese->English dictionary, although the words within it
- can be selected in either language using appropriate software. (JDIC uses it to
- provide both E->J and J->E functionality.)
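-
- As a rough illustration of selecting entries from either language, the sketch
- below builds two in-memory look-up tables from parsed entries. It is not how
- JDIC or JDXGEN index the file; the function name build_indexes and the
- (headword, reading, glosses) tuple layout are assumptions made for this example.
-
```python
from collections import defaultdict


def build_indexes(entries):
    """Build J->E and E->J look-up tables.

    entries: iterable of (headword, reading, glosses) tuples, where reading
    may be None and glosses is a list of English translations.
    """
    by_japanese = defaultdict(list)
    by_english = defaultdict(list)
    for entry in entries:
        headword, reading, glosses = entry
        # Japanese side: index by the headword and, if present, the reading.
        by_japanese[headword].append(entry)
        if reading:
            by_japanese[reading].append(entry)
        # English side: index by each distinct lower-cased word in the glosses.
        words = {w for gloss in glosses for w in gloss.lower().split()}
        for word in words:
            by_english[word].append(entry)
    return by_japanese, by_english
```
-
- Note that this naive word split also indexes bracketed codes that appear inside
- a translation, so a real program would probably filter those out first.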
-
- The limitations on size inherent in the dictionary due to its current usage
- (MOKE scans it sequentially and JDXGEN, which is JDIC's index generator, needs
- to hold it in RAM) have meant that examples of usage cannot be included, and
- inclusion of phrases is very limited.
-
- No inflections of verbs or adjectives have been included, except in idiomatic
- expressions. Similarly, particles are handled as separate entries. Adverbs
- formed from adjectives (-ku or -ni) are not included. Verbs are, of course,
- in the plain or "dictionary" form.
-
- In working on EDICT, bearing in mind I want to use it in MOKE and with JDIC, I
- have had to come up with a solution to the problem of adjectival nouns
- [keiyoudoushi] (e.g. kirei and kantan) and verbs formed by adding suru (e.g.
- benkyousuru). If I put entries in EDICT with the "na" and "suru" included,
- MOKE will not find a match when they are omitted or, in the case of suru,
- inflected. What I have decided to do is to put the basic noun into the
- dictionary and add "(vs)" where it can be used to form a verb with suru, and
- "(an)" if it is an adjectival noun. Entries appear as:
-
- KANJI [benkyou] /study (vs)/
- KANJI [kantan] /simple (an)/
-
- Where necessary, verbs are marked with "(vi)" or "(vt)" according to whether
- they are intransitive or transitive. (Work on this aspect is continuing.) I
- have also used (id) to mark idiomatic expressions, (col) for colloquialisms,
- (pol) for teineigo, etc.
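-
- A program reading EDICT may want to separate these codes from the translation
- text. The sketch below is one possible way to do that in Python. The function
- name split_markers is invented for this example, and the set of codes contains
- only the ones named above; the file itself may use others.
-
```python
import re

# Codes named in the text above; the real file may contain more.
KNOWN_MARKERS = {"vs", "an", "vi", "vt", "id", "col", "pol"}

_MARKER = re.compile(r"\(([a-z]+)\)")


def split_markers(gloss):
    """Return (text, markers) for one translation such as 'study (vs)'."""
    markers = [m for m in _MARKER.findall(gloss) if m in KNOWN_MARKERS]

    def drop_known(match):
        # Remove recognised codes; keep any other parenthesised text intact.
        return "" if match.group(1) in KNOWN_MARKERS else match.group(0)

    text = _MARKER.sub(drop_known, gloss)
    # Collapse the whitespace left behind by removed codes.
    return " ".join(text.split()), markers
```
-
- With this, "study (vs)" splits into the text "study" and the code list
- ["vs"], so a look-up program can tell that the noun also forms a suru verb.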
-
- USAGE
- -----
-
- EDICT can be used as the dictionary within MOKE simply by renaming it "EDICT"
- (or JTOE.DCT in the new version 2.1 of MOKE). If you are a MOKE user and have
- been adding to your EDICT using the "Ask English?" option, you may wish to
- append your additions. Why not send them to me and I will add them to EDICT?
-
- EDICT can be used, with acknowledgement, for any purpose whatever, EXCEPT for
- inclusion in new commercial products. Mark Edwards can, of course, use it in
- later MOKE releases. Stephen Chung may also be using it in his PD "JWP".
-
- CONTRIBUTIONS
- -------------
-
- I will be delighted if people send me corrections, suggestions, and ESPECIALLY
- additions. Before ripping in with a lot of suggestions, make sure you have the
- latest version, as others may have already made the same comments.
-
- The preferred format for submissions is a JIS or EUC file (uuencoded for
- safety) containing replacement/new entries. Separate the amendments from the
- new material: e.g.
-
- **Amendments to EDICT yyyymmmdd Vyy-nnn**
-
- old entry1
- new entry1
- old entry2
- ........
-
- **New Entries**
-
- New entry1
- New entry2
- .........
-
- I prefer not to get a "diff" or "patch" file as the master edictj is under
- continuous revision, and may have had quite a few changes since you got your
- copy.
-
- ACKNOWLEDGEMENTS
- ----------------
-
- Mark Edwards, Spencer Green, Alina Skoutarides, Takako Machida, Theresa Martin,
- Satoshi Tadokoro, Stephen Chung, Hidekazu Tozaki, Clifford Olling, David
- Cooper, Ken Lunde, Joel Schulman, Hiroto Kagotani, Truett Smith, Mike Rosenlof,
- Harold Rowe, Al Harkom, Per Hammarlund, Atsushi Fukumoto, John Crossley, Bob
- Kerns, Frank O'Carroll, Rik Smoody, Scott Trent, Curtis Eubanks, Jamie Packer,
- Hitoshi Doi, Thalawyn Silverwood, Makato Shimojima, Bart Mathias, Koichi Mori.
-
- Jim Breen
- (jwb@capek.rdt.monash.edu.au)
- Department of Robotics & Digital Technology
- Monash University
- Caulfield East 3145
- AUSTRALIA